Introduction to Python 3

Python is a modern programming language that

  • is open source
  • is interpreted
    • interpreters exist for most platforms
  • is multi-paradigm (incl. object-oriented)
  • comes with batteries included whenever possible

Versions

Version 2 of the Python language (2.7 is the last minor version) is what made Python popular. However, it was far from perfect, and version 3 fixes many of the most glaring design flaws in Python 2.

Because version 2 gained popularity rapidly, it has taken over 10 years for version 3 to gain a foothold. This is the first time CSC gives its introduction to Python course using Python 3.

Three levels of Python

There are three levels of functionality you can use in Python:

  • the built-in parts
    • the language itself that is used to write programs
  • the standard library
    • these cover many common tasks in programming in general, e.g.
      • file system and operating system abstraction
      • reading standardized file formats (zip, xml, csv, etc.)
      • most common data communications protocols (HTTP and email protocols)
      • more data types, programming libraries
  • the Python ecosystem, mostly available via the Python Package Index PyPI
    • tens of thousands of packages of varying quality
    • libraries for
      • numeric computation (e.g. NumPy)
      • machine learning (e.g. scikit-learn)
      • HTTP frameworks (e.g. Django)
      • natural language analysis (e.g. nltk)
      • data visualization

The core, or built-in, parts of Python are relatively small and we will cover them first.

The typical way to write Python programs is to write them in script files that end in .py and can be run with the python command. We will get to that later, but first we use this Jupyter Notebook to go over the basics of the language.

Syntax

First, a few motivational words from The Zen of Python:

Beautiful is better than ugly.
Simple is better than complex.
Readability counts.

The design of Python aims for simplicity.

First program

The first exercise in most programming tutorials is a Hello World program.

You can edit the code in the cell below and run it by clicking the run button in the toolbar above or by pressing CTRL+Enter while the cell is in focus (surrounded by a green box).

The text between the quotation marks "" is a string. print is a function, and its arguments are given inside parentheses () in a C-like style.


In [ ]:
print("hello world!")

These exercises are run in this notebook environment, but you could just as easily copy the code above to a file called hello.py and run it with the command

$ python hello.py
hello world!

Extra: compare this with a hello world program in some other programming language that you know. Is it simpler or more complex? What kinds of design decisions must have been made for the example to be this simple?

Getting help

The built-in function help() will show you interactive documentation about most Python objects when you're inside an interpreter.

If you want to know all the members of an object (more about objects and classes later) you can call the dir() function.


In [ ]:
help(print)
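
The dir() function mentioned above can be tried in the same way. The cell below is only a small sketch: the variable name members and the choice of a string as the object to inspect are made up for illustration.

```python
# dir() returns the member names of an object as a list of strings
members = dir("hello world!")
print(members)

# the result is an ordinary list, so it can be searched
print("upper" in members)
```

Names surrounded by double underscores are special members that Python itself uses; the rest are regular methods such as upper.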

Variables and data types

A variable is something whose value can change during the execution of a program. It is referenced by a name.

In Python variable names

  • may contain letters, numbers or underscores
  • start with a letter or underscore (but not with a number!)
  • are case sensitive

Underscores at the beginning or end of a variable name are part of an idiomatic coding style that gives hints to the reader of the code. We will get to that later.

Try them out below:


In [ ]:
hello_example_1 = "hello world!" # comments are marked with the #-sign
hello_example_1 = 5
hello_example_1 
# in a Jupyter notebook if the cell ends 
# with a single variable, the system will print the 
# value for you

Python is a dynamically typed, strongly typed language. It's OK not to understand the terms completely. They are mentioned simply because they carry a very specific meaning to experienced programmers.

In practice this means that:

  • variables (and their types) don't need to be declared
  • trying to use a variable of an incorrect type will result in errors
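
The two points above can be sketched as follows. The variable names thing and result are made up for the example, and the try/except construct used to catch the error is not covered until later, so treat this as an illustration rather than something to memorize.

```python
thing = 5              # no declaration needed, the name now refers to an int
print(type(thing))

thing = "five"         # the same name can later refer to a str
print(type(thing))

try:
    result = thing + 1     # adding an int to a str is not allowed
except TypeError as error:
    result = None
    print("TypeError:", error)
```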

Data types

Python has a small set of basic data types, which fall into the groups introduced below. Every value in Python has a type, and you can use the built-in function type() to check the type of a variable.

  • bool: a boolean data type that can be either True or False (note capitalization of first letter)

  • Numeric types, that represent numbers

    • int: integers, not limited in length
    • float: floating point numbers, like doubles in C, with similar caveats
    • complex: complex numbers, with the imaginary part marked by the suffix j (not covered in this tutorial)
  • Sequences:

    • str: String, a sequence of Unicode characters in the range U+0000 - U+10FFFF
    • bytes: a sequence of integers in the range 0-255, i.e. raw data
    • bytearray: like bytes, but mutable
    • list: a mutable ordered sequence of objects
    • tuple: an immutable ordered sequence of objects
  • Sets

    • set: an unordered collection of unique objects
    • frozenset: like set, but immutable
  • Mappings

    • dict: a dictionary, also called a hashmap
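
As a quick sketch, the built-in function type() can be used to check the type of a literal of each kind listed above. The values and the variable name big are arbitrary examples.

```python
print(type(True))          # bool
print(type(5))             # int
print(type(5.0))           # float
print(type("five"))        # str
print(type(b"five"))       # bytes
print(type([5]))           # list
print(type((5,)))          # tuple
print(type({5}))           # set
print(type({5: "five"}))   # dict

big = 2 ** 100             # an int grows as needed, there is no overflow
print(big)

# floats are binary approximations, so this comparison is False
print(0.1 + 0.2 == 0.3)
```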

Python is dynamically typed, which means that data types do not need to be declared; they are determined at run time.

Python is strongly typed, which means that it typically does not attempt to coerce one data type into another. For instance, it is not possible to concatenate a string and a number, which is valid in many other languages. The number needs to be converted into a string explicitly.

The typing in Python is called duck typing. It is sufficient to implement the functions required and not necessary to explicitly implement an interface like in e.g. Java or C#.
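
A minimal sketch of duck typing: the hypothetical function count_items below never checks the type of its argument, it only relies on the argument supporting len(). Function definitions are covered later, and the names used here are made up for illustration.

```python
def count_items(container):
    # no type check: anything that supports len() is accepted
    return "contains " + str(len(container)) + " items"

print(count_items([1, 2, 3]))           # a list
print(count_items("hello"))             # a str
print(count_items({"key": "value"}))    # a dict
```

Passing an object that does not support len(), such as the number 5, would raise a TypeError at the moment len() is called.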

Each of the abovementioned types is also a built-in function that returns objects of said type.
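
For example, the type names can be called to convert a value from one type to another. The values below are arbitrary and only meant as a sketch.

```python
print(int("5") + 1)                 # str converted to int
print(float(5))                     # int converted to float
print(str(5) + "!")                 # int converted to str
print(bool(0), bool(1))             # numbers converted to bool
print(list("abc"))                  # a str is a sequence of characters
print(tuple([1, 2, 3]))             # list converted to tuple
print(set([1, 1, 2]))               # duplicates are dropped
print(dict([("key", "value")]))     # key-value pairs converted to dict
```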

Sequences, sets and mappings are often iterated over. More on this later.


In [ ]:
value = 5
value2 = value + 1

my_string = "hello "
my_string = my_string + str(value2) # you can attempt the same without converting to string
print(my_string)

Mutable and immutable data types

Some data types are mutable and some are immutable.

Mutable data types can be changed after they are created for example:

  • a list can be appended to
  • a byte in a byte array can be altered
  • a set can be added to
  • a dict can be added to

Immutable data types cannot be changed after they are created. Any operation on an immutable object returns a new object and leaves the original untouched. Typically this new value then needs to be assigned to a variable.

Immutable                          Mutable
numeric types (int, float, etc.)
tuple                              list
str                                bytearray
frozenset                          set
                                   dict

Only hashable objects can be keys in a dict. Of the built-in types this means the immutable ones (a tuple only if all of its items are hashable too).
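
The restriction on dict keys can be sketched like this. The dict name grid and its contents are made up for the example, and try/except is covered later.

```python
grid = {}
grid[(0, 1)] = "a tuple works as a key"   # tuples are immutable
print(grid)

try:
    grid[[0, 1]] = "a list does not"      # lists are mutable
except TypeError as error:
    print("TypeError:", error)

print(len(grid))   # only the tuple key was stored
```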


In [ ]:
# mutable examples
dict_ = {"key": "value"}
dict_["key2"] = "value2"
print(dict_)

list_ = ["egg", "sausage", "bacon"]
list2 = list_
list_.append("spam")
print(list_)

# variables are just references to objects in memory
# for mutable types all references see the same object after it has changed
print(list2)

# immutable examples

str_ = "hello world!"
print(str_.replace("l", ""))
print(str_)

tuple_ = (4, 5, 6)
print(tuple_ + (6, 7))
print(tuple_)

Lists

Lists are created using [] brackets or the list() constructor, which accepts many other types of objects. There is no requirement for all the objects in a list to be of the same type. This is a consequence of the dynamic typing mentioned earlier.

Lists support multiple types of indexing.


In [ ]:
my_list = [1, 2, 3, 4]

print(my_list[0]) # indexing starts from 0
print(my_list[1:3]) # so-called slice syntax selects a part of a list
print(my_list[-1]) # negative indices are also permitted, -1 is the last index
print(my_list[-3:-1]) # also in slicing
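
A few more slice forms, as a sketch using the same example list: either end of a slice can be left out, and an optional third value sets the step.

```python
my_list = [1, 2, 3, 4]

print(my_list[:2])     # start omitted: from the beginning
print(my_list[2:])     # end omitted: to the end
print(my_list[::2])    # a third value gives the step
print(my_list[::-1])   # a negative step walks backwards
```

A slice always returns a new list, so the original my_list is not modified by any of these.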

Lists can be appended to in several ways:


In [ ]:
my_list = [1, 2]
my_list.append(3) # modifies in place, takes a single item
print(my_list)
my_list.extend([4, 5]) # takes another list
print(my_list)
another_list = my_list + [6, 7] # makes a copy
print(another_list)
print(my_list)
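
The difference between modifying a list in place and making a copy matters when a second name refers to the same list. The following sketch uses a made-up name alias to show it, along with the insert() method for adding an item at a given position.

```python
my_list = [1, 2]
alias = my_list            # a second name for the same list object

my_list += [3]             # modifies in place, like extend()
print(alias)               # the alias sees the change

my_list = my_list + [4]    # builds a new list and rebinds the name
print(my_list)
print(alias)               # the alias still refers to the old list

my_list.insert(0, 0)       # insert(index, item) adds at a given position
print(my_list)
```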

Dictionaries

Dictionaries are also accessed using [] brackets, but by key rather than by position.

A dict also has a get() method that accepts a default value to return if the key is not present.

Values are assigned using the same bracket notation. If the key already exists, its value is overwritten.


In [ ]:
my_dict = {1: 2, "key": "value"}
print(my_dict[1])
print(my_dict["key"])
print(my_dict.get("im_not_there", "default"))

my_dict["key2"] = "i was just inserted"
print(my_dict["key2"])
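
The following sketch shows what happens with existing and missing keys. The keys and values are arbitrary examples, and try/except is covered later.

```python
my_dict = {"key": "value"}

my_dict["key"] = "new value"     # existing key: the value is replaced
print(my_dict)

print("missing" in my_dict)      # the in operator tests for a key
print(my_dict.get("missing"))    # get() without a default returns None

try:
    my_dict["missing"]           # bracket access to a missing key
except KeyError as error:
    print("KeyError:", error)
```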

Tuples

A comma defines a tuple. For example,

a,b

is a valid tuple. It is conventional to use parentheses to make the presence of a tuple more explicit,

(a, b)

but in most contexts the parentheses are not required.

Python does automatic packing and unpacking of tuples, as is illustrated by the following example.


In [ ]:
a, b = 1, 2
a, b = b, a
# Check what the values of a and b are now
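
A few more cases of packing and unpacking, as a sketch. The variable names are made up for the example; divmod() is a built-in function that returns the quotient and the remainder as a tuple.

```python
single = 1,                  # a trailing comma makes a one-element tuple
empty = ()                   # the empty tuple is written with parentheses only
print(single, empty)

quotient, remainder = divmod(7, 2)   # the returned tuple is unpacked
print(quotient, remainder)

first, *rest = (1, 2, 3)     # a starred name collects the remaining items into a list
print(first, rest)
```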